007_002_lab_DQN 2 (Nature 2015).html
https://www.youtube.com/watch?v=ByB49iDMiZE&list=PLlMkM4tgfjnKsCWav-Z2F-MMFRx-2gMGG&index=16
@
DQN (2013):
1. Go deep
2. Use a replay memory to store transitions,
   to resolve correlations between samples
DQN (2015):
1. Use separate networks (main and target),
   to resolve the non-stationary target issue

Core idea of DQN 2013
img 2018-04-29 16-50-01.png

Core idea of DQN 2015
img 2018-04-29 16-50-37.png

@
How to create separated networks
img 2018-04-29 16-51-24.png

@
DQN vs targetDQN
img 2018-04-29 16-52-22.png

@
How to handle two networks in code
img 2018-04-29 16-54-35.png

@
Copying a network means copying the values of its weights
img 2018-04-29 16-56-59.png

@
Summary
1. Create two networks (main and target)
2. Make the target network identical to the main network:
   target = mainNet
3. Run the environment loop, storing transitions;
   when you build the training target y, use the target network,
   and update the main network using y
4. Periodically make the target network identical to the main network again,
   by assigning the main network's weights to the target network
@
Code related to replay train (targetDQN added)
img 2018-04-29 17-01-39.png

@
Code related to copying a network (its variables)
img 2018-04-29 17-02-23.png

Code related to bot play
img 2018-04-29 17-02-57.png

Code related to main()
img 2018-04-29 17-03-49.png
(minimal code sketches of these pieces appear at the end of these notes)

Exercise 1
Tune hyperparameters (learning rate, sample size, decay factor)
Network structure:
  add bias
  test tanh, sigmoid, relu, etc.
  improve the TF network to reduce the number of sess.run() calls
Reward redesign
img 2018-04-29 17-06-37.png

Exercise 2
Car racing with DQN 2015
img 2018-04-29 17-07-20.png

Exercise 3
DQN implementations
Other games
RMA approach
img 2018-04-29 17-08-00.png
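The code in the screenshots above is not reproduced in these notes, so the blocks below give a minimal sketch of the same pieces in TensorFlow 1.x. This is not the lecture's exact code: the class and function names (DQN, replay_train, get_copy_var_ops), the network shape, and the hyperparameters are illustrative assumptions. First, a simple Q-network wrapper with predict() and update() helpers; it is built under a named variable scope so that its weights can later be collected by scope name, which is what the copy operation relies on.

```python
import numpy as np
import tensorflow as tf  # assumes TensorFlow 1.x


class DQN:
    """Minimal Q-network: a state goes in, one Q-value per action comes out."""

    def __init__(self, session, input_size, output_size, name="main"):
        self.session = session
        self.input_size = input_size
        self.output_size = output_size
        self.net_name = name
        self._build_network()

    def _build_network(self, h_size=16, learning_rate=1e-3):
        # Build under a variable scope ("main" or "target") so the trainable
        # variables of each network can be fetched by scope name later.
        with tf.variable_scope(self.net_name):
            self._X = tf.placeholder(tf.float32, [None, self.input_size], name="input_x")
            W1 = tf.get_variable("W1", shape=[self.input_size, h_size],
                                 initializer=tf.contrib.layers.xavier_initializer())
            layer1 = tf.nn.relu(tf.matmul(self._X, W1))
            W2 = tf.get_variable("W2", shape=[h_size, self.output_size],
                                 initializer=tf.contrib.layers.xavier_initializer())
            self._Qpred = tf.matmul(layer1, W2)

        self._Y = tf.placeholder(tf.float32, [None, self.output_size])
        self._loss = tf.reduce_mean(tf.square(self._Y - self._Qpred))
        self._train = tf.train.AdamOptimizer(learning_rate).minimize(self._loss)

    def predict(self, state):
        # Q-values for a single state, shape (1, output_size).
        x = np.reshape(state, [1, self.input_size])
        return self.session.run(self._Qpred, feed_dict={self._X: x})

    def update(self, x_stack, y_stack):
        # One gradient step toward the targets y (used for the main network only).
        return self.session.run([self._loss, self._train],
                                feed_dict={self._X: x_stack, self._Y: y_stack})
```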
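Next, a sketch of the two routines the "replay train" and "copy network" screenshots cover, reusing the DQN class and imports above. get_copy_var_ops builds the assign ops that overwrite the target network's weights with the main network's weights (steps 2 and 4 of the summary); replay_train builds the target y with targetDQN and updates only mainDQN (step 3).

```python
def get_copy_var_ops(dest_scope_name="target", src_scope_name="main"):
    # Copying a network means copying the values of its weights:
    # for each trainable variable under "main", assign its value to the
    # matching variable under "target".
    src_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=src_scope_name)
    dest_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=dest_scope_name)
    return [dest_var.assign(src_var.value())
            for src_var, dest_var in zip(src_vars, dest_vars)]


def replay_train(mainDQN, targetDQN, train_batch, dis=0.99):
    # Build (state, y) pairs from a minibatch of stored transitions.
    x_stack = np.empty(0).reshape(0, mainDQN.input_size)
    y_stack = np.empty(0).reshape(0, mainDQN.output_size)
    for state, action, reward, next_state, done in train_batch:
        Q = mainDQN.predict(state)
        if done:
            Q[0, action] = reward
        else:
            # The target y is computed with the *target* network ...
            Q[0, action] = reward + dis * np.max(targetDQN.predict(next_state))
        x_stack = np.vstack([x_stack, state])
        y_stack = np.vstack([y_stack, Q[0]])
    # ... and only the *main* network is updated.
    return mainDQN.update(x_stack, y_stack)
```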
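Finally, a sketch of how main() might tie these pieces together on CartPole, following the four summary steps. The episode count, epsilon schedule, batch size, and sync interval are placeholder choices, and env.reset()/env.step() assume the classic (pre-0.26) gym API used at the time of the lecture.

```python
import random
from collections import deque

import gym


def main():
    env = gym.make("CartPole-v0")
    input_size = env.observation_space.shape[0]
    output_size = env.action_space.n
    replay_buffer = deque(maxlen=50000)
    batch_size = 64

    with tf.Session() as sess:
        # Step 1: create two networks, main and target.
        mainDQN = DQN(sess, input_size, output_size, name="main")
        targetDQN = DQN(sess, input_size, output_size, name="target")
        sess.run(tf.global_variables_initializer())

        # Step 2: make the target network identical to the main network.
        copy_ops = get_copy_var_ops(dest_scope_name="target", src_scope_name="main")
        sess.run(copy_ops)

        for episode in range(2000):
            e = 1.0 / ((episode / 10) + 1)  # decaying epsilon for exploration
            state = env.reset()
            done = False
            while not done:
                # Epsilon-greedy action from the main network.
                if np.random.rand() < e:
                    action = env.action_space.sample()
                else:
                    action = int(np.argmax(mainDQN.predict(state)))
                next_state, reward, done, _ = env.step(action)
                replay_buffer.append((state, action, reward, next_state, done))
                state = next_state

            # Step 3: every few episodes, train the main network on random
            # minibatches; y is built inside replay_train using the target network.
            if episode % 10 == 1 and len(replay_buffer) > batch_size:
                for _ in range(50):
                    minibatch = random.sample(replay_buffer, batch_size)
                    replay_train(mainDQN, targetDQN, minibatch)
                # Step 4: re-sync the target network with the main network.
                sess.run(copy_ops)


if __name__ == "__main__":
    main()
```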